Natural Language Processing and Machine Translation Encyclopedia of Language and Linguistics, 2nd ed. (ELL2). Machine Translation: Interlingual Methods

Authors

  • Bonnie J. Dorr
  • Eduard H. Hovy
  • Lori S. Levin
Abstract

An interlingua is a notation for representing the content of a text that abstracts away from the characteristics of the language itself and focuses on the meaning (semantics) alone. Interlinguas are typically used as pivot representations in machine translation, allowing the contents of a source text to be generated in many different target languages. Due to the complexities involved, few interlinguas are more than demonstration prototypes, and only one has been used in a commercial MT system. In this article we define the components of an interlingua and the principal issues faced by designers and builders of interlinguas and interlingua MT systems, illustrating with examples from operational systems and research prototypes. We discuss current efforts to annotate texts with interlingua-based information.

Similar resources

The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...

Towards An Interlingual Treatment of Modality

Modality is an important, but complex linguistic phenomenon that concerns all levels of language production. NLP research has rather refrained from this subject, but we show that many errors in machine translation systems are directly related to the absence of a proper interlingual treatment of modality. We outline the traces of such a modal interlingua by presenting the “Module of Modality”, p...

A new model for persian multi-part words edition based on statistical machine translation

Multi-part words in English are hyphenated, with the hyphen separating the parts. Persian also has multi-part words. In Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...

Interlingual Annotation of Parallel Text Corpora: A New Framework for Annotation and Evaluation

This paper focuses on the next step in the creation of a system of meaning representation and the development of semantically-annotated parallel corpora, for use in applications such as machine translation, question answering, text summarization, and information retrieval. The work described below constitutes the first effort of any kind to provide parallel corpora annotated with detailed deep ...

Corpus based coreference resolution for Farsi text

"Coreference resolution" or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreference when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...

Publication date: 2010